Orange4WS Environment for Service-Oriented Data Mining
Authors
Abstract
Novel data-mining tasks in e-science involve mining of distributed, highly heterogeneous data and knowledge sources. However, standard data mining platforms, such as Weka and Orange, use only their own data mining algorithms in the process of knowledge discovery from local data sources. In contrast, next-generation data mining technologies should enable processing of distributed data sources, the use of data mining algorithms implemented as web services, and the use of formal descriptions of data sources and knowledge discovery tools in the form of ontologies, enabling automated composition of complex knowledge discovery workflows for a given data mining task. This paper proposes a novel Service-oriented Knowledge Discovery framework and its implementation in the service-oriented data mining environment Orange4WS (Orange for Web Services), based on the existing Orange data mining toolbox and its visual programming environment, which enables manual composition of data mining workflows. Orange4WS adds the following new features: simple use of web services as remote components that can be included in a data mining workflow; simple incorporation of relational data mining algorithms; and a knowledge discovery ontology that describes workflow components (data, knowledge and data mining services) in an abstract and machine-interpretable way, together with its use by a planner that enables automated composition of data mining workflows. These new features are showcased in three real-world scenarios.
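To make the idea of planner-driven workflow composition more concrete, the sketch below shows one simple way it could work. It is an illustration only, not the Orange4WS planner or its ontology. It assumes that each workflow component is annotated with the data types it consumes and produces, and it uses a plain breadth-first search in place of an ontology-based planner. All component and type names (`discretize`, `rule_learner_service`, `Dataset`, and so on) are hypothetical.

```python
from collections import deque
from typing import NamedTuple


class Component(NamedTuple):
    """Hypothetical annotation of a workflow component (local or web service)."""
    name: str
    inputs: frozenset   # data types the component requires
    outputs: frozenset  # data types the component produces


def plan_workflow(components, available, goal):
    """Return a shortest list of component names that produces `goal`.

    Breadth-first search over sets of available data types. Returns None
    when no sequence of components can produce the goal type.
    """
    start = frozenset(available)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        have, steps = queue.popleft()
        if goal in have:
            return steps
        for comp in components:
            # A component is applicable if all its inputs are available
            # and it would add at least one new data type.
            if comp.inputs <= have and not comp.outputs <= have:
                nxt = have | comp.outputs
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [comp.name]))
    return None


# Hypothetical component annotations, standing in for ontology descriptions.
COMPONENTS = [
    Component("discretize", frozenset({"Dataset"}), frozenset({"DiscreteDataset"})),
    Component("rule_learner_service", frozenset({"DiscreteDataset"}), frozenset({"RuleModel"})),
    Component("tree_learner", frozenset({"Dataset"}), frozenset({"TreeModel"})),
    Component("evaluate_rules", frozenset({"RuleModel", "Dataset"}), frozenset({"Report"})),
]

plan = plan_workflow(COMPONENTS, {"Dataset"}, "Report")
print(plan)
```

In this toy setting the planner chains discretization, a remote rule learner and an evaluation step, because that is the shortest sequence whose declared inputs and outputs connect the available `Dataset` to the requested `Report`. A real ontology-based planner would also reason over type hierarchies and component constraints, which this sketch omits.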
Similar resources
Preventing Key Performance Indicators Violations Based on Proactive Runtime Adaptation in Service Oriented Environment
Key Performance Indicator (KPI) is a type of performance measurement that evaluates the success of an organization or a partial activity in which it engages. If during the running process instance the monitoring results show that the KPIs do not reach their target values, then the influential factors should be identified, and the appropriate adaptation strategies should be performed to prevent ...
TOWARDS SEMANTIC DATA MINING WITH g-SEGS
This paper introduces the term semantic data mining to denote a data mining approach where domain ontologies are used as background knowledge for data mining. It is motivated by successful applications of SEGS (search for enriched gene sets), a system that uses biological ontologies as background knowledge to construct descriptions of interesting gene sets in experimental microarray data. We ge...
Development of a framework to evaluate service-oriented architecture governance using COBIT approach
Nowadays organizations require an effective governance framework for their service-oriented architecture (SOA) so that they can evaluate the current state of their governance, determine the governance requirements, and then offer a suitable model for their governance. Various frameworks have been developed to evaluate the SOA governance. In this paper, a brief introd...
A Distributed Approach to Extract High Utility Itemsets from XML Data
This paper investigates a new data mining capability that entails mining of High Utility Itemsets (HUI) in a distributed environment. Existing research in data mining deals only with the presence or absence of items and does not consider semantic measures such as the weight or cost of the items. Thus, the HUI mining algorithm has evolved. HUI mining is one kind of utility mining concept that aims to iden...
Exploiting ontologies and higher order knowledge in relational data mining Doctoral Thesis
Present day knowledge discovery tasks require mining heterogeneous and structured data and knowledge sources. The key enabling factors for performing these tasks include efficient exploitation of knowledge about the domain of discovery and utilizing meta knowledge about the data mining process, which facilitates the construction of complex workflows consisting of highly specialized algorithms. ...
Journal title:
- Comput. J.
Volume 55, issue
Pages -
Publication date 2012